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1  Introduction 


Numerical  algorithms  for  the  solution  of  ordinary  differential  equations  (ODEs)  are 
an  essential  part  of  many  scientific  computations.  As  a  consequence,  a  broad  variety 
of  such  algorithms  exists  and  their  construction  and  analysis  are  a  well-established 
fields  (see,  for  example,  [10,  11,  14,  2,  15]). 

The  existing  algorithms  can  be  grouped  into  direct  solvers  such  as  Runge-Kutta 
and  multi-step  methods  [10]  and  schemes  which  improve  the  solution  iteratively  as,  for 
example,  Richardson  extrapolation  [15]  and  deferred  correction  [6,  13,  12].  Typically, 
the  construction  of  direct  solvers  is  based  on  the  assumption  that  the  solution  of  the 
ODE  can  be  locally  approximated  by  a  polynomial,  with  the  degree  of  the  polynomial 
determining  the  order  of  accuracy  of  the  scheme  (see,  for  example,  [15]).  Explicit 
variants  of  such  schemes  are  applicable  to  non-stiff  problems;  the  ones  in  wide  use 
appear  to  be  of  orders  2  to  12.  Iterative  methods  (such  as  deferred  corrections)  have 
been  published  that  are  of  essentially  arbitrary  order. 

While  the  existing  direct  and  iterative  techniques  provide  a  reliable  tool  for  many 
applications,  they  tend  to  be  very  expensive  for  solving  ODEs  with  high  accuracy. 
Direct  methods  require  very  small  step-sizes  to  achieve  high  accuracy,  while  iterative 
methods  involve  repeated  solution  of  the  ODE  (or  its  slight  modification). 

In  the  early  stages  of  numerical  computing  there  have  been  attempts  to  construct 
direct  schemes  based  on  the  assumption  that  the  solution  which  is  to  be  computed  can 
be  represented  as  linear  combination  of  exponentials  or  trigonometric  functions  [7,  3], 
an  approach  similar  to  the  one  taken  in  this  paper.  However,  these  endeavors  never 
resulted  in  algorithms  competitive  in  practical  applications. 

In  this  paper  we  introduce  a  new  class  of  algorithms  for  solving  the  Cauchy 
problem  for  non-stiff  ordinary  differential  equations  (ODEs)  which  have  a  signifi¬ 
cantly  faster  rate  of  convergence  than  the  classical  schemes.  The  schemes  are  of  the 
predictor-corrector  type  and  are  based  on  the  approximation  of  the  solution  by  a 
linear  combination  of  a  finite  collection  of  exponentials,  chosen  via  an  appropriate 
“skeletonization”  procedure  (see  [9,  4]).  The  latter  representation  is  combined  with 
standard  numerical  tools  (such  as  least  squares  approximation)  to  construct  efficient 
extrapolation  and  integration  formulae.  The  starting  values  required  to  initialize  the 
predictor-corrector  scheme  are  computed  via  a  spectral  deferred  correction  method 
(see  [6]). 

It  should  be  observed  that  the  algorithms  presented  in  this  paper  are  not  con¬ 
vergent  in  the  classical  sense;  each  particular  scheme  quickly  converges  to  a  certain 
predetermined  precision,  after  which  its  accuracy  stops  improving  as  a  function  of  the 
number  of  nodes  in  the  discretization.  While  this  can  be  viewed  as  a  drawback  from 
the  theoretical  point  of  view,  in  practical  calculations  the  precision  is  always  fixed 
(in  double  precision  arithmetic,  it  is  about  16  digits).  Thus,  the  performance  of  a 
scheme  designed  with  16  digits  is  indistinguishable  from  that  of  a  classically  conver¬ 
gent  scheme,  as  long  as  the  calculations  are  performed  in  double  precision  arithmetic. 

Throughout  this  paper,  it  is  assumed  that  the  Cauchy  problem  to  be  solved  is  in 
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the  form 


(1) 

(2) 


<p(a)  =  <Po, 

where  t  G  [a,b\,  <p0  G  Cm,  ip  :  R  — >  Cm  and  F  :  R  x  Cm  — >  Cm  is  assumed  to 
be  sufficiently  smooth.  As  is  well  known,  any  initial  value  problem  involving  ODEs 
with  higher  derivatives  can  be  reduced  to  this  standard  form.  For  the  convenience  of 
presentation  we  assume  that  m  —  1,  as  the  generalization  of  the  construction  for  the 
case  of  arbitrary  m  is  straightforward. 

The  structure  of  this  paper  is  as  follows.  In  Section  2,  we  summarize  several  well- 
known  mathematical  and  numerical  tools  to  be  used  in  this  paper,  while  in  Section  3, 
we  discuss  the  specific  analytical  and  mathematical  apparatus  required  for  the  con¬ 
struction  of  the  numerical  solvers.  Section  4  describes  the  numerical  solvers,  and  in 
Section  5  we  report  a  number  of  numerical  experiments,  including  the  establishment 
of  stability  and  accuracy  properties. 

2  Mathematical  and  numerical  preliminaries 

In  this  section  we  summarize  several  well-known  facts  to  be  used  in  the  remainder  of 
this  paper.  All  of  these  can  be  found,  for  example,  in  [1,  10,  11,  5,  4],  Throughout 
this  paper,  the  set  {4, . . . , tn}  with  4  —  a,  t2  —  b  and  ti  —  a+(i  —  1  )h  will  be  called 
a  equidistant  discretization  of  the  interval  [a,b]  with  step-size  h.  Furthermore,  the 
derivative  of  a  function  /  :  R  — >  R  will  be  denoted  by  /',  the  conjugate  transpose  is 
denoted  by  *,  the  complex  modulus  or  absolute  value  is  denoted  by  |  •  |,  and  |j  •  H2 
denotes  the  Euclidian  norm  of  a  vector  or  the  corresponding  operator  norm  of  a 
matrix. 

2.1  Linear  predictor-corrector  methods 

A  predictor-corrector  method  solves  the  initial  value  problem  (1)  by  advancing  the 
solution  step  by  step  via  extrapolation  (predictor  step)  and  integration  (corrector 
step),  based  on  the  k  previous  time  steps.  Given  the  values  of  93  and  ip'  at  the  k 
equidistant  nodes  4,4,  ■  ■ .  ,4  the  predictor  step  approximates  the  value  of  ip  at  the 
next  time  step  tk+ 1  by  the  linear  extrapolation  formula 

k 

^(4+i)  =  IpMU)  +  Pk+iF{U ,  ip{U))} ,  (3) 

i= 1 

where  pi,  ■  ■  ■  ,P2k  £  R.  The  result  is  used  to  compute  the  value  of  <p'(tk+ 1)  via  equa¬ 
tion  (1).  The  corrector  step  then  recomputes  <p(tk+ 1)  by  the  integration  formula 

k 

<p(tk+ 1)  =  X]  +  ck+iF(ti,  ip(ti ))]  +  c2k+iF(tk+1,  93(4+1)),  (4) 

2=1 
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where  cy, . . .  ,c2k+ 1  G  R,  and  the  value  of  p'(tk+ 1)  is  updated  via  equation  (1).  One 
predictor  step  can  be  followed  by  several  corrector  steps. 

In  accordance  with  conventions  in  the  literature,  we  refer  to  a  numerical  scheme 
which  solves  equation  (1)  by  computing  +  1)  from  <p(ti), . . . ,  (p(tk)  via  one  pre¬ 
dictor  step  (3)  and  m  corrector  steps  (4)  as  £;-step  PE(CE)m  method  (see  [15]).  For 
their  initialization  fc-step  PE(CE)m  methods  require  k  starting  values. 

Remark  1  Most  predictor-corrector  methods  in  use  ignore  the  values  of  p  in  formu¬ 
lae  (3),  (4)  and  use  only  the  information  about  the  derivative  F(t,p).  In  this  paper, 
however,  we  will  use  formulae  (3),  (4)  in  their  full  generality. 

Remark  2  The  most  popular  predictor-corrector  methods  appear  to  be  Adarns- 
Bashforth- Moulton  schemes  (see,  for  example,  [10,  15]).  The  predictor  and  corrector 
formulae  in  these  schemes  are  based  on  approximating  p'  by  a  polynomial. 


2.2  Accuracy  and  stability  of  numerical  ODE  solvers 

The  two  characteristics  which  are  generally  used  to  describe  the  performance  of  a 
numerical  ODE  solver  are  its  order  of  accuracy  and  its  stability  region  (see,  for  ex¬ 
ample,  [10,  11,  14]).  Let  p  denote  the  solution  to  the  initial  value  problem  (1),  (2), 
obtained  via  a  numerical  solver  at  the  equidistant  discretization  t\, ...  ,tn  of  [a,  b]  with 
step-size  h.  The  underlying  method  is  said  to  be  of  order  of  accuracy  p,  if  for  any 
sufficiently  smooth  F  in  equation  (1)  there  exists  a  constant  M  >  0  such  that 

I  <p(b)-ip(b)\<Mh>;  (5) 

for  any  choice  of  h,  that  is  sufficiently  small. 

The  stability  of  a  numerical  ODE  solver  is  generally  determined  by  analyzing  its 
behavior  on  test  problems  of  the  form 

=  A< p(t),  (6) 

¥>(0)  =  1  (7) 


where  A  G  C.  A  numerical  scheme,  which  computes  the  solution  p  to  this  problem  at 
the  equidistant  discretization  1 1, . . . ,  tn  with  step-size  one,  is  said  to  be  stable  for  A, 

if 

<p(U)  <  1,  (8) 

for  i  =  1, ...  ,n.  The  set  of  values  of  A  for  which  a  numerical  method  is  stable  is 
called  its  stability  domain. 

The  stability  of  a  k-step  PE(CE)m  method,  in  particular,  can  be  determined 
by  computing  the  eigenvalues  of  the  matrix  B  which  corresponds  to  the  the  linear 
mapping  B  :  Cfc  — >  <Ck,  defined  by 


Vih) 

<p(h) 

<p(tk+ i) 


(9) 
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where  bp(t\), . . . ,  bp(tk)  are  arbitrary  starting  values  and  bp{tk+ 1)  is  the  result  of  using 
the  predictor-corrector  method  to  take  a  step  of  length  one  for  the  solution  of  equa¬ 
tion  (6).  If  the  biggest  eigenvalue  of  B  is  less  than  one  then  the  method  is  stable 
for  A.  The  matrix  B  is  of  the  form 


/  0  1  \ 

0  1 

\W  b2  ■■■  bk  ) 


(10) 


where  the  coefficients  b\, ...  ,bk  G  C  depend  on  the  given  predictor-corrector  method 
and  on  A.  Specifically,  bi  =  <pi(tk+ 1),  where  <fi(tk+ 1)  is  the  result  of  applying  the  given 
predictor-corrector  method  to  the  starting  values  (pi(ti), . . . (pi(tk),  chosen  as 


1,  if  j  =  i 
0,  if j^i 


(11) 


Remark  3  The  matrix  B  is  the  transpose  of  a  companion  matrix  and  has  the  char¬ 
acteristic  polynomial  (see,  for  example,  [8]) 

—b\  H — b2z  H — b^z2  +  •  •  ■  —  bkzk  1  +  zk  =  0.  (12) 

In  the  literature  (see,  for  example,  [11,  14])  the  stability  of  a  multi-step  method  is 
often  determined  by  checking  if  the  largest  root  of  equation  (12)  is  less  or  equal  to 
one.  Obviously,  this  is  equivalent  to  the  stability  criterion  given  above. 


Finally,  suppose  as  before,  that  bp  denotes  the  solution  to  (6),  (7),  obtained  via 
a  numerical  method  at  the  equidistant  discretization  ti, . . .  ,tn  of  [a,b\  with  step-size 
one.  For  a  specified  precision  e  we  define  the  method’s  accuracy  domain  to  be  the  set 
of  all  A,  for  which 


'EL 

ELIvMI2 


<  e, 


(13) 


where  tp  :  C  i— >  C,  z  i— >  eAz  denotes  the  analytical  solution  to  equations  (6),  (7). 


Remark  4  Obviously,  increasing  the  number  of  discretization  steps  of  a  given  in¬ 
terval  will  increase  the  frequency  domain  to  which  a  numerical  method  applies.  In 
general,  changing  the  step-size  of  a  numerical  scheme  for  the  solution  of  the  prob¬ 
lem  (6),  (7)  from  one  to  h  is  equivalent  to  changing  the  exponent  A  in  equation  (6) 
to  A h,  as  the  dilation  r  —  t/h  transforms  equation  (6)  into 

J^(r)  =  A h  <p(t).  (14) 

In  the  case  of  the  variable  step-size  h,  stability  and  accuracy  regions  therefore  apply 
to  the  product  A h. 


4 


Remark  5  A  standard  way  of  describing  the  performance  of  a  numerical  scheme  for 
a  given  accuracy  is  in  terms  of  the  number  of  steps  per  wavelength.  If  the  accuracy 
domain  of  the  scheme  for  the  precision  £  contains  the  semi-disk 

S=  {A  g  C  |  Re(A)  <  0,  |A|  <  |A|},  (15) 

then  the  scheme  is  said  to  require 

2tt 

steps  per  wavelength  to  achieve  the  accuracy  e. 


2.3  SVD  and  least  squares 

Given  the  linear  system 

Ax  =  b,  (17) 

where  A  G  Cmxn  and  b  G  Cm,  any  vector  x  G  Cn  which  minimizes 

\\Ax  -  b\\2  (18) 

is  called  a  least  squares  solution  of  system  (17).  If  X  is  the  set  of  least  squares 
solutions  of  (17)  then  X  contains  one  unique  element  of  minimum  L2-norm 

Xlm  =  argmin{|  |x|  |2  |  x  G  A"},  (19) 

which  is  called  the  minimum  norm  least  squares  solution.  The  singular  value  decom¬ 
position  (SVD)  provides  an  explicit  expression  for  If  A  =  UXV*  is  the  SVD  of 

A,  i.e.  U  G  Cmxm  and  V  G  Cnxn  are  orthonormal  and  E  G  Rmxn  is  diagonal,  then 

xlm  =  V  — D,  (20) 

where  (ui, . . . ,  urn)  are  the  columns  of  I/,  (iq, . . . ,  vn)  are  the  columns  of  V,  (a i, . . . ,  crr ) 
are  the  positive  diagonal  elements  of  E,  and  r  is  the  rank  of  A. 

In  numerical  applications  it  is  reasonable  to  define  the  rank  of  matrix  A  to  pre¬ 
cision  e.  Given  £  >  0  and  the  singular  values  (oq , . . . ,  amin(m)n) )  of  A,  sorted  in 
decreasing  order,  the  numerical  rank  of  precision  £  of  A  is  defined  as  r  G  IN,  such 
that  ar  >  £  and  <jr+i  <  e.  Accordingly,  we  define  the  minimum  norm  least  squares 
solution  of  precision  £  of  system  (17)  as 

r(£)  *b 

xLm{s)  =  E  —Vi,  (21) 

tr 

where  r(e)  is  the  numerical  rank  of  precision  £  of  A  and  all  other  quantities  are  as  in 
equation  (20). 

Given  an  algorithm  for  the  computation  of  the  SVD,  the  minimum  norm  least 
squares  solution  xlm{s)  can  be  computed  via  (21).  However,  besides  this  obvi¬ 
ous  SVD-based  algorithm,  there  exist  various  other  schemes  for  the  computation  of 
xlm{z),  which  are  usually  faster;  an  example  is  the  complete  orthogonal  factorization 
described  in  Chapter  5  of  [8]. 
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2.4  Matrix  interpolation  (Skeletonization) 

The  following  lemma  will  be  used  for  the  approximation  of  exponentials  of  bounded 
frequency.  It  is  often  referred  to  as  skeletonization  and  is  given  by  [9,  4]  in  slightly 
more  general  form. 

Lemma  6  (Skeletonization)  Given  a  matrix  A  e  Cmxn  with  columns  (ai, . . . ,  an), 
and  k  G  IN  such  that  1  <  k  <  min(m,n),  there  exists  a  selection  of  k  columns  of  A 
such  that 

A  =  (an i, . . . ,  ank)T  +  X,  (22) 

where  all  elements  ofTE  Cfcxm  have  magnitude  less  than  one  and  the  operator  norm 
of  X  e  Cmxn  is  bounded  by  the  ( k  +  1  )th  singular  value  Gk+i(A)  of  A  as  follows 

||A"||2  <  ak+i(A)^/l  +  k(min(m,  n)  -  k).  (23) 


In  other  words,  the  lemma  states  that  A  can  be  approximated  by  interpolating  be¬ 
tween  k  of  A’s  columns,  if  the  ( k  +  l)th  singular  value  of  A  is  small,  i.  e.  if  k  is  close 
to  the  numerical  rank  of  A.  The  fact  that  all  elements  in  the  interpolation  matrix  T 
have  magnitude  less  than  one  guarantees  that  this  interpolation  is  stable. 

Remark  7  The  factorization  (22)  can  typically  be  computed  in  0(nmk)  operations 
(the  worst  case  is  0(mnl)  operations,  where  l  =  min(m,  n))  by  the  algorithms  de¬ 
scribed  in  [9,  4], 


2.5  Spectral  deferred  correction  methods 

Spectral  deferred  correction  methods  solve  the  initial  value  problem  (1),  (2)  by  start¬ 
ing  with  a  low  accuracy  approximation  to  the  solution  and  improving  it  iteratively 
(see  [6,  13,  12]).  During  each  iteration  the  error  £  of  the  current  approximation  p3  is 
computed  via  the  formula 

e(t)  =  <p(a)+  f  F(T,(p3(r))dT  -  ip3(t),  (24) 

J  a 

which  is  based  on  the  integral  form  (Picard  equation)  of  equations  (1),  (2), 

¥>(*)  =  ¥>(«)+/  F(T,<p(T))dT.  (25) 

J  a 

The  approximation  <p3{t)  is  then  updated  via  the  formula 

<^+1W  =  V°(f)  +  (26) 


where  q(t)  has  to  satisfy  the  integral  equation 

7(f)  =  /  [F('r,<Pi(T)+'y(T))-F(T,<p3(T))]dT  +  £(t),  (27) 

J  a 
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which  results  from  combining  (25)  with  (24)  and  assuming  that  tfij+i  satisfies  (25). 

Spectral  deferred  correction  algorithms  solve  equation  (27)  by  standard  ODE 
solvers  such  as  Euler’s  method,  and  evaluate  the  integral  in  (24)  by  spectral  inte¬ 
gration  (see  [6]).  A  typical  choice  for  the  discretization  of  the  integration  interval 
are  Gaussian  or  Tchebycheff  nodes  (see  [6,  13]).  A  discussion  of  the  stability  and 
convergence  properties  of  these  schemes  can  be  found  in  [6,  12].  In  Section  3.4  we 
construct  a  spectral  deferred  correction  scheme  for  equidistant  nodes. 

2.6  A  second-order  Runge-Kutta  method 

The  second-order  Runge-Kutta  method  solves  the  initial- value  problem  (1),  (2)  on 
the  interval  [to,  to  +  L]  by  taking  a  sequence  of  n  steps: 

ti- j-1  t-i  T  h , 

ki+ 1  =  h  F(ti  +  h,  < p(ti )  +  ki), 

tpiU- t-i)  =  <p(tj)  +  —  (ki  +  ki+ 1).  (28) 

where  h  =  L/n  and  ko  =  hF(t0,  (po)-  This  algorithm  requires  exactly  n+ 1  evaluations 
of  the  function  F  and  its  order  of  accuracy  is  two. 


2.7  Bisection  and  the  secant  method 

The  bisection  method  and  the  secant  method  are  classical  root  finding  schemes  (see, 
for  example,  [5]),  which  we  employ  for  the  computation  of  the  stability  and  accuracy 
domains.  Specifically,  the  bisection  method  Ends  the  root  of  a  function  /  within  the 
interval  [aq,^]  iteratively,  by  testing  whether  /  changes  its  sign  to  the  left  or  to  the 
right  of 

Xi  +  x2 


If  f(x3)  has  the  same  sign  as  f{x2)  the  method  sets  x2  =  a;  3  and  repeats,  else  it 
sets  xi  =  aq  and  repeats.  The  bisection  stops  when  aq  and  x2  are  sufficiently  close. 
Evidently,  if  [aq,aq]  contains  one  single  root,  bisection  requires  0(log(e))  steps  to 
obtain  the  root  with  precision  e. 

The  secant  method  is  an  iterative  scheme  as  well,  which  requires  two  starting 
values.  It  computes  the  root  of  the  function  /  by  approximating  /  by  its  secant 
through  the  points  X\  and  x2  which  have  been  obtained  during  the  previous  two 
iterations.  The  secant  through  aq  and  x2  intersects  the  a;- axis  at 

=  f(x2)a q  -  f(x1)x2 
13  f(x2)-f(xi) 

If  f(x 2)  >  f(x  1)  the  secant  methods  sets  x\  =  x3  and  repeats,  else  it  sets  x2  =  x3 
and  repeats.  The  order  of  convergence  of  the  secant  method  is  approximately  1.6, 
however,  the  secant  method  is  not  guaranteed  to  converge. 


(30) 
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2.8  The  maximum  modulus  principle 

The  following  result  can  be  found  in  any  standard  textbook  on  complex  analysis. 


Lemma  8  (Maximum  modulus  principle)  If  D  C  C  is  a  bounded  connected  set 
with  boundary  dD,  U  C  C  is  an  open  set  such  that  (D  U  dD)  C  U  and  the  function 
f  :  U  — >  C  is  analytic  on  U  then  the  maximum  value  of  \  f\  on  (D  U  dD)  occurs  on 
the  boundary,  i.  e. 


max  |/| 


max  |/| . 


(31) 


3  Analytical  Apparatus 

In  this  section,  we  discuss  the  construction  of  the  extrapolation  and  integration  for¬ 
mulae  used  by  the  predictor-corrector  methods  of  this  paper  (Sections  3.1,  3.1),  an 
efficient  representation  of  functions  of  bounded  frequency  (Section  3.3),  and  spectral 
methods  for  functions  of  bounded  frequency  on  equidistant  nodes  (Section  3.4). 

3.1  The  predictor  step:  Extrapolation  of  exponentials 

The  predictor  step  of  a  predictor-corrector  scheme  approximates  the  solution  of  the 
ODE  at  the  next  time  step  by  extrapolating  the  solution  based  on  its  values  and 
derivatives  at  previous  time  steps.  In  this  section  we  construct  an  extrapolation 
formula,  which  is  based  on  the  assumption  that  the  ODE  solution,  which  is  to  be 
extrapolated,  is  a  linear  combination  of  the  finite  set  of  exponentials  eXlt,  eX2t, . . . ,  eXnt , 
where  \  G  0  (The  selection  of  the  A*  is  described  in  Section  3.3  below). 

Suppose  that  the  values  and  derivatives  of  the  functions  which  are  to  be  extrapo¬ 
lated  are  given  at  an  equidistant  discretization  •  •  •  ,4-  of  the  interval  [-1,1]  with 
step-size  h,  and  they  are  to  be  evaluated  at  tk+i  =  1  +  h  via  the  formula 

k 

<p(tk+ 1)  =  \pMU)  +  Pk+i<p'(U)] ,  (32) 

2—1 

where  p±, . .  .p2fc  £  H-  Formula  (32)  is  valid  for  all  functions  p  of  the  form  ip(t)  = 
YTi=l  cqeXit  with  cq  G  C,  if  p  =  (pi, . . .  ,P2k)T  satisfies  the  linear  system 

Ap  =  b,  (33) 


where  the  elements  of  A  G  Cnx2fc  are 

a  f  eXit\  i  =  =  1, . . . ,  fc 

lj  \  X  ieXiti~k,  i  =  1, ...  j  =  k  +  1, ...  ,2k  ’ 

and  the  components  of  b  G  C"  are 


(34) 


bi  =  eXitk+1 


(35) 


In  general,  the  vector  p  can  be  approximated  by  the  minimum  norm  least  squares 
solution  (21)  of  system  (33): 


7 ,*h 

POO  =  E  4»i, 

i= 1 


CTi 


(36) 


where  e  >  0  is  a  fixed  precision  value,  and  r(e),  ai:  Ui ,  and  are  as  in  equation  (21). 


Remark  9  The  choice  of  the  parameters  n,  k,  and  c,  for  the  construction  of  the 
extrapolation  formula  will  affect  the  stability  and  accuracy  properties  of  the  resulting 
predictor-corrector  scheme.  If  2k  >  n  then  the  accuracy  of  the  extrapolation  for¬ 
mula  (32)  is  of  order  e.  Furthermore,  if  n  and  k  are  hxed,  ||p(e)||2  tends  to  grow 
as  £  is  decreased,  which  indicates  that  increasing  e  will  improve  the  stability  of  the 
resulting  schemes.  Similar,  increasing  the  number  of  steps  k  will  also  decrease  j  |p(er)  1 1 2 
and  will  tend  to  improve  the  stability. 

We  have  investigated  the  stability  properties  of  the  predictor-corrector  schemes  of 
this  paper  numerically  and  will  discuss  the  results  in  Section  5. 


3.2  The  corrector  step:  Integration  of  exponentials 

After  using  the  result  for  (p(tk+ 1)  from  the  predictor  step  to  obtain  (p'(tk+ 1)  via  equa¬ 
tion  (1),  the  corrector  step  recomputes  the  value  of  (p(tk+i)  via  the  formula 

k 

<p{tk+ 1)  =  ^2  [c^(^)  +  ck+i<p'(ti)]  +  c2k+i<p'(tk+i),  (37) 

i=  1 

where  ci, . . . ,  c^k+i  £  R.  As  before,  we  assume  that  ip(t)  =  Y^i= 1  with  cq  G  C. 

Hence,  c  =  (ci, . . . ,  C2k+i Y  has  to  satisfy 


Ac  =  b, 


(38) 


where  b  is  as  in  equation  (35),  and  the  elements  of  A  G  Cnx2fc+1  are 


f  Aij,  i  =  1, . . .  ,n;  j  =  1, . . .  ,2k 

\  \,eXttk+1 ,  i  —  1, . . . ,  n;  j  —  2k  +  1 


(39) 


The  vector  c  can  be  approximated  by  the  minimum  norm  least  squares  solution  (21) 
of  system  (38). 


Remark  10  As  in  the  predictor  case,  the  choice  of  the  parameters  n,  k  and  £  will 
affect  the  stability  and  accuracy  properties  of  the  resulting  predictor-corrector  scheme 
(See  Remark  9).  In  general,  the  integration  during  the  corrector  step  is  a  significantly 
more  stable  procedure  than  the  extrapolation  during  the  predictor  step;  additional 
corrector  steps  will  therefore  typically  improve  the  stability  of  the  resulting  predictor- 
corrector  schemes.  Section  5  contains  a  discussion  of  these  stability  issues,  which  is 
based  on  numerical  experiments. 
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Remark  11  Both  the  predictor  formula  (32)  and  the  corrector  formula  (37)  make  use 
of  the  values  and  the  derivatives  of  the  function  p .  While  it  would  be  feasible  to  base 

the  predictor  step  and  the  corrector  step  only  on  the  derivatives  p'(tf), _ ,  p'(tk), 

which  is  the  approach  of  most  classical  predictor-corrector  schemes,  our  approach  of 
always  using  the  information  about  the  values  and  the  derivatives  leads  to  significantly 
more  stable  formulae. 

3.3  Skeletonization  of  a  semi-disk  in  C 

In  this  section  we  describe  how  Ai, . . . ,  An  in  equations  (34),  (39)  can  be  selected  such 
that  (32),  (37)  will  be  satisfied  for  all  functions  f\(x)  =  e~Xx,  for  which  A  lies  within 
the  complex  semi-disk 


Sr  =  { A  G  <D  j  Re(A)  <  0,  |A|  <  r}  .  (40) 

The  discussion  proceeds  in  two  steps.  First,  Lemma  12  establishes  that  it  is  suf¬ 
ficient  to  discretize  the  boundary  of  Sr  (denoted  by  dSr).  Then,  we  construct  the 
discretization  of  dSr  by  employing  Lemma  6. 

Lemma  12  Given  the  initial  value  problem 

cp'{t)  =  Xp ,  (41) 

¥>(0)  =  Po,  (42) 

let  ip\  denote  its  solution  and  p>x  its  approximate  solution  computed  via  a  k-step 
PE(CE)m  method  as  described  in  Section  2.1  (assuming  that  the  k  required  start¬ 
ing  values  are  given).  Furthermore,  let  D  denote  a  connected  bounded  region  in  the 
complex  plane  and  d D  its  boundary. 

If  for  any  fixed  tel  and  e  >  0  the  inequality 

\px{t)  -  px(t)\  <  e  (43) 

holds  for  all  A  G  dD,  then  (43)  also  holds  for  all  A  G  D. 

Proof.  For  any  fixed  value  of  t  a  PE{CE)m  method  computes  px(t)  by  a  finite 
number  of  additions  and  multiplications  of  linear  functions  of  A.  The  function  gt  : 
C  — »  C  defined  by  A  i— >  p\(t)  is  therefore  a  polynomial  in  A  and  hence  an  analytic 
function  of  A.  Since  the  function  ft  :  C  — >  C  defined  by  A  i— >  ext  is  also  an  analytic 
function  of  A,  the  error  term  (<px(t)  —  <P\(t))  is  an  analytic  function  of  A  as  well.  The 
lemma  follows  therefore  from  the  maximum  modulus  principle  8.  □ 

Lemma  12  implies  that  it  is  sufficient  to  construct  a  discretization  the  boundary 
of  Sr,  which  we  accomplish  as  follows.  Since  the  result  of  one  step  of  a  PE(CE)m 
method  depends  linearly  on  the  solution  at  the  previous  nodes,  any  method  valid 
for  the  functions  eXlt, . . . ,  eXnt  will  also  be  valid  for  any  linear  combination  of  these 
functions.  In  particular,  let  eXlt, . . . ,  eXnt  be  a  collection  of  functions  such  that  for  a 
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10-16 

1^ 

O 
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O 

1 

CO 
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n 

4 

12 

18 

24 

30 

38 

54 

Table  1:  Number  n  of  nodes  for  different  accuracies  8  (Skeletonization  of  S3.15) 

specified  precision  8  and  for  any  A  G  dSr  there  exist  coefficients  ci, . . . ,  cn  G  C  such 
that 

n 

ext  -  ^  <  S>  (44) 

i=  1 

for  all  t  G  [—1, 1].  If  (43)  is  satisfied  for  A  =  Ai,  A2, . . . ,  An,  then 

\<Px(t)  -tp\(t)\  <  £  +  5,  (45) 

for  all  A  G  d Sr  (triangle  inequality).  Therefore,  it  is  sufficient  to  select  a  set  of  expo¬ 
nentials  which  can  generate  all  exponentials  |eA*|  A  G  <9SV}  up  to  a  finite  precision  A 
via  linear  interpolation.  Since  exponentials  of  the  form  ext  are  smooth  functions 
of  A,  it  is  clear  that  such  a  selection  of  exponentials  exists;  a  specific  example  is  a 
sufficiently  dense  equidistant  discretization  of  dSr,  where  the  arclength  between  two 
neighboring  nodes  in  the  discretization  is  the  same  for  all  neighboring  pairs. 

Lemma  6  provides  a  tool  to  choose  an  efficient  discretization  of  the  type  dis¬ 
cussed  above.  Starting  with  an  equidistant  discretization,  we  employ  the  factoriza¬ 
tion  in  Lemma  6  to  significantly  reduce  the  number  of  required  nodes.  Suppose 
that  Ai, . . . ,  Ajv  and  t\, ... ,  £m  are  sufficiently  dense  equidistant  discretizations  of  dSr 
and  [-1,1],  respectively,  and  the  factorization  (22)  of 

gAi t\  gA2*i  .  .  .  gAjvU 

gAl*2  gA2*2  .  .  .  gAiV*2 

gAl*M  gA2 tM  .  .  .  g^NtM 

with  error  term  ||X||2  <  <5  yields  the  k  columns  with  indices  {ii, . . .  ,ik}-  Then  the 
exponentials  eAii*, . . .  ,eAi*A  span  |eAt|  A  G  dSr }  up  to  precision  5  and  are  therefore 
an  appropriate  representation  of  {eAf|  A  G  dSr}  (and  hence  of  { ext |  A  G  S',.}),  when 
the  precision  5  is  required. 

For  the  values  of  6  and  r  which  are  of  interest  in  this  paper,  the  above  procedure, 
often  referred  to  as  skeletonization,  results  in  a  number  of  nodes  between  12  and  54, 
as  illustrated  in  Table  1,  which  shows  the  number  of  nodes  n  that  are  required  for 
the  skeletonization  of  Sr  with  r  =  3.15  for  different  accuracies  5.  Figure  1  shows  the 
skeletonization  of  S3. 15  for  the  precisions  8  =  10~10  and  8  =  10-16. 

Remark  13  Since  dSr  is  symmetric  with  respect  to  the  real  axis,  it  is  sufficient  to 
discretize  the  upper  half  of  dSr  (Im(A)  >  0)  and  choose  the  discretization  of  the  lower 
half  of  dSr,  such  that  it  is  symmetric  to  the  upper  half.  In  order  to  ensure  that  this 
leads  to  an  efficient  discretization  close  to  the  real  axis,  the  discretization  of  the  upper 
half  of  dSr  can  be  constrained  to  include  Ai  =  0  and  A2  =  —  r\J — 1. 
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Figure  1:  Skeletonization  of  Sr  with  r  =  3.15 


Remark  14  The  construction  of  predictor  and  corrector  formulae  (32),  (37)  makes 
use  of  the  assumption  that  the  numerical  solution  to  equation  (6)  is  sampled  at  an 
equidistant  discretization  ti,...,tk  of  [-1,1]  with  step-size  h  —  2/(k  —  1)  and  is  of 
the  form  f\  =  ext,  where  A  G  Sr.  If  the  resulting  formulae  are  applied  to  a  different 
equidistant  sequence  of  nodes  ti, . . .  ,4  with  the  new  step-size  h  =  ah ,  they  will  be 
valid  for  all  solutions  of  the  form  f\  =  ext,  where  A  G  Sr/a.  In  other  words,  halving  the 
step-size  will  double  the  maximal  frequency  of  the  functions  to  which  the  predictor- 
corrector  scheme  is  applicable.  This  follows  from  the  fact  that  (1)  is  invariant  under 
translations  and,  furthermore,  the  dilation  t  —  at  transforms  equation  (6)  into 

y^r)  =  A?(t).  (47) 

Remark  15  The  choice  of  the  parameter  r  in  (40)  determines  the  maximal  frequency 
of  the  functions  to  which  the  predictor  and  corrector  formulae  (32),  (37)  apply  for 
the  fixed  step-size  h  =  2/k.  By  construction,  the  number  of  steps  per  waveleiigth 
to  achieve  the  optimal  accuracy  of  the  resulting  scheme  will  therefore  be  approxi¬ 
mately 

3.4  Spectral  deferred  correction  on  equidistant  nodes 

In  this  section  we  describe  a  spectral  deferred  correction  method  to  be  used  to  com¬ 
pute  the  starting  values  for  the  predictor-corrector  scheme.  The  deferred  correction 
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method  will  be  constructed  on  the  equidistant  discretization  t\,t2,  ■  ■  ■  ,tk  of  [—1,1] 
with  step-size  h.  Furthermore,  we  will  assume  that  the  desired  solution  to  (1),  (2)  is 
a  linear  combination  of  eAlt,  eAzt, . . . ,  eXnt,  where  Ai, . . . ,  An  is  a  discretization  of  dSr 
as  discussed  in  the  previous  section. 

Given  these  assumptions,  the  value  for  s(ti)  as  defined  in  (24)  can  be  evaluated 
by  the  quadrature  rule 


'-1 


ip(T)rdr  »  5^¥>(<i)ry, 


(48) 


i— 1 


where  r%3  G  R.  The  corresponding  weight  matrix  r  =  (rl3)  has  to  satisfy  the  linear 


system 


Ar  =  b, 


(49) 


where  A  G  <Enxk, 


A 


I  ti  gA2^2 


gApt/g  \ 

gA2  tfc  I 


V 


0Xntl 


g A  71^2 


3  A  n^k 


1 


(50) 


and  b  G  Cnxfc 


(  f_\  eXlTdr  ■  ■  •  f*_\  eXlTdr  \ 

/l“eAsT* 


(51) 


V  f- 1  eXnTdr  ■  ■  ■  eXnTdr  / 

Approximating  each  column  of  the  matrix  r  by  the  minimum  norm  least  squares 
solution  (21)  of  precision  e  of  (49)  results  in  a  quadrature  formula  whose  accuracy 
is  of  order  e.  Since  the  integral  and  its  approximation  are  analytical  functions  of  A, 
it  follows  from  the  maximum  principle  (Lemma  8)  that  (48)  also  holds  for  all  linear 
combinations  of  {eA*  j  A  G  S',.}  (see  Lemma  12). 

Equations  (25)  and  (27)  can  be  solved  by  applying  a  second  order  Runge-Kutta 
scheme  (28).  In  particular,  the  second  order  Runge-Kutta  scheme  for  the  solution  of 
equation  (27)  with  step-size  h  is  given  by 


7*+i  =  7*  +  hG(tu  7 i)  +  e(ti+l)  -  e(U), 

7*+i  =  7*  +  2  7 i)  +  7j+i)]  +  s(ti+ 1)  —  s(ti),  (52) 

where  G(t,  7)  =  F(t,  t )  +  7)  —  F(t,  t ))  and  7 ,  denotes  7(G). 


Remark  16  The  resulting  deferred  correction  scheme  makes  use  of  the  assumption 
that  the  numerical  solution  to  equation  (1)  is  obtained  at  an  equidistant  discretization 
ti, . . . ,  tk  of  [-1,1]  with  step-size  h  =  2/{k—l)  and  is  of  the  form  f\  =  ext,  where  A  G  Sr. 
If  the  scheme  is  applied  to  on  the  nodes  ti, ...  ,tk  with  the  new  step-size  h  =  ah,  it 
will  be  valid  for  all  solutions  of  the  form  f\  =  ext,  where  A  G  Sr/a  (see  Remark  14). 

Remark  17  The  stability  and  accuracy  properties  of  the  deferred  correction  schemes 
discussed  in  this  section  have  been  established  numerically  and  are  described  in  Sec¬ 
tion  5. 
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4  Description  of  the  Algorithm 

This  section  contains  a  description  of  the  algorithms  that  are  the  principal  goal  of  this 
paper.  The  informal  description  in  the  section  below  is  followed  by  a  more  detailed 
description  in  Section  4.2. 

4.1  Informal  description 

Several  precomputation  steps  of  the  algorithm  require  a  discretization  of  Sr  as  defined 
in  (40),  which  can  be  precomputed  via  factorization  (22)  of  matrix  (46).  As  a  first  step 
the  algorithm  uses  a  spectral  deferred  correction  scheme  to  compute  the  solution  to  the 
initial  value  problem  (1),  (2)  at  k  equidistant  nodes  with  step-size  h,  which  are  used 
as  the  starting  values  for  the  predictor-corrector  scheme.  The  initial  solution  for  the 
deferred  correction  iterations  is  obtained  via  a  second  order  Runge-Kutta  scheme  (28), 
after  which  the  solution  is  improved  iteratively  via  (24),  (26),  (27),  where  7 (t)  is 
computed  via  the  Runge-Kutta  scheme  (52)  and  the  integral  in  (24)  is  evaluated  via 
the  approximation  (48).  The  integration  vectors  ri,...,T7  are  the  minimum  norm 
least  squares  solution  (21)  of  (49),  and  can  be  precomputed.  The  iterations  stop 
when  at  all  k  nodes,  7 (t)  in  equation  (27)  is  less  than  a  pre-set  accuracy  e. 

After  the  initial  k  values  of  the  solution  (pit )  have  been  computed,  the  algorithm 
advances  the  solution,  taking  steps  of  size  h ,  via  a  k-step  PE(CE)m  method  (see 
Section  2.1).  During  each  time  step  one  predictor  step  is  followed  by  m  corrector 
steps.  The  predictor  and  corrector  steps  are  based  on  the  assumption  that  the  solution 
to  (1)  can  be  represented  by  a  linear  combination  of  exponentials  ext,  where  A  lies  in 
the  complex  semi-disk  Sr  (40).  In  particular,  the  predictor  step  approximates  (p{tk+ 1) 
via  (32),  where  the  vector  p  in  (32)  is  the  minimum  norm  least  squares  solution  of  (33) 
and  can  is  precomputed  once  and  for  all. 

After  the  predictor  step,  the  approximate  value  of  (pitk+i )  is  used  to  compute 
ip'(tk+i)  via  (1).  The  corrector  step  then  re-computes  (pitk+i)  via  formula  (37),  where 
the  vector  c  in  (37)  is  the  minimum  norm  least  squares  solution  of  (38)  and  is  pre¬ 
computed  once  and  for  all.  The  corrector  step  is  repeated  m  times. 

If  the  interval  [a,  b]  is  discretized  into  N  points  then  the  predictor  corrector  scheme 
requires  (m  +  1)  •  N  evaluations  of  the  function  F  in  (1),  where  m  is  number  of  times 
the  corrector  step  is  applied  after  each  predictor  step. 

Remark  18  If  (1),  (2)  is  a  system  of  ODEs,  i.  e.  the  dimension  of  (pit)  is  bigger 
than  one,  the  same  algorithm  can  be  applied  by  performing  all  steps  componentwise. 

4.2  Detailed  description 

In  this  section,  we  describe  the  algorithm  of  this  paper  in  some  detail.  The  presenta¬ 
tion  proceeds  in  two  parts:  Algorithm  1.1  is  the  procedure  for  computing  the  starting 
values  for  the  predictor-corrector  method  via  a  spectral  deferred  correction  scheme 
based  on  equidistant  nodes;  Algorithm  1.2  is  the  main  algorithm,  which  is  a  k- step 
PE(CE)m  method. 
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Algorithm  1.1  (Spectral  Deferred  Correction) 

Comment  [  Algorithm  1.1  computes  the  solution  to  initial  value  problem  (1),  (2)  at  a 
given  equidistant  discretization  ti, ... ,tk  via  a  spectral  deferred  correction  scheme.  It 
has  two  parts:  the  precomputation  which  is  performed  offline  and  the  main  algorithm. 
Its  implementation  requires  the  choice  of  the  following  parameters:  k  is  the  number 
of  nodes  in  the  discretization,  ElS,5,M,N  are  parameters  during  the  precomputation, 
which  will  determine  the  accuracy  properties  of  the  algorithm,  and  finally  £i  is  the 
stopping  criterion  for  the  correction  iterations.] 

Precomputation 

1.  Compute  a  discretization  Ai, . . . ,  \n  of  Sr  by  computing  the  factorization  (22) 
of  the  M  x  N  matrix  (46),  such  that  1 1 AC 1 1 2  <  h. 

2.  Compute  r  as  the  minimum  norm  least  squares  solution  of  precision  £^s  of  the 
system  Ar  =  b,  where  A  and  bi  are  defined  in  (50),  (51). 

Main  algorithm 

Compute  p°(ti), . . .  ,tp°(tk)  by  solving  equation  (1)  with  initial  condition  p{t\)  =  p 0 
via  the  second-order  Runge-Kutta  method  (28). 

Repeat 

1.  For  j  =  1  ,...,n,  compute 

k 

£{tj)  =  +  ^F(ti,pJ(ti))rij  (53) 

i= 1 

where  the  matrix  r  =  (r\j)  has  been  precomputed. 

2.  Compute  7 (t)  at  t\,...,tk  by  solving  equation  (27)  with  the  initial  condition 
q(ti)  =  0  via  the  Runge-Kutta  scheme  (52). 

3.  Set  p3+1(U)  =  pJ(U)  +  7 (ti)  for  ti, . . .  ,tfc. 
until  7 (ti)  <  £1  for  i  —  1, , . . ,  k 

Return  p3+l(ti), . . . ,  p3+l(tk). 
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Algorithm  1.2  (Predictor-Corrector) 

Comment  [  Algorithm  1.2  computes  the  solution  to  initial  value  problem  (1),  (2)  at  a 
given  equidistant  discretization  ti, . . . ,  ijv  yia  a  k-step  PE(CE)m  method.  It  requires 
the  starting  values  . . .  ,<p(tk)  as  input  and  has  two  parts:  the  precomputation 

and  the  main  algorithm.  The  precomputation  requires  the  choice  of  the  parameters 
5,  M ,  N,spls,  which  will  be  introduced  below.] 

Precomputation 

1.  Compute  a  discretization  Ai, . . . ,  An  of  Sr  by  computing  the  factorization  (22) 
of  the  M  x  N  matrix  (46),  such  that  ||A||2  <  S. 

2.  Compute  the  minimum  norm  least  squares  solution  (21)  of  precision  eps  of  the 
system  Ap  =  b ,  where  A  and  b  are  defined  in  (34)  and  (35),  respectively. 

3.  Compute  the  minimum  norm  least  squares  solution  (21)  of  precision  of  the 
system  Ac  =  b ,  where  A  and  b  are  defined  in  (39)  and  (35),  respectively. 

Main  Algorithm 

do  j  =  k,  N  —  1 

1.  Compute  <p(tj+ 1)  via 


k 

<P(tj+l)  =  [PiV^U-V+i)  +Pk+i<Pf(t(j-k)+i)]  ,  (54) 

i=  1 

where  the  vector  p  —  {p\ , . . .  ,p2jt)  has  been  precomputed. 

2.  Compute  p'(tj+1)  =  F  (tj+1,  (p(tj+1)). 

3.  Recompute  (p(tj+ 1)  via 

k 

1)  ^  ^  T  Ck-\-i(p  (^(j— fc)-|-i)]  T  C2k-\-\^P  (tj+ 1);  (55) 

i=  1 


where  the  vector  c  =  (ci, . . . ,  c2fc+i)  has  been  precomputed.  Then  recompute 
ip'(tj+i)  as  in  the  previous  step.  Perform  this  step  m  times. 

end  do 


Return  . . . ,  (p(tN). 


16 


Name 

r 

M 

N 

5 

n 

k 

£d 

£i 

DC1 

3.15 

800 

800 

io-10 

18 

22 

10~9 

o 

1 

o 

DC2 

6.3 

800 

800 

10-19 

46 

60 

o 

1 

00 

10-15 

DC3 

3.15 

800 

800 

IQ-19 

38 

42 

10-39 

10"15 

DC4 

6.3 

800 

800 

10-36 

76 

80 

o 

1 

CO 

to 

10“31 

Table  2:  Implementations  of  the  deferred  correction  scheme  (Algorithm  1.1) 


Name 

r 

M 

N 

5 

n 

k 

£p 

£c 

PCI 

3.15 

800 

800 

10-3° 

18 

22 

10"9 

10“9 

PC2 

6.3 

800 

800 

10”17 

46 

60 

IQ-36 

10-36 

PC3 

3.15 

800 

800 

o 

1 

to 

o 

38 

42 

IQ-39 

o 

1 

00 

PC4 

3.15 

800 

800 

O 

1 

CO 

76 

80 

o 

1 

CO 

o 

IQ-32 

Table  3:  Implementations  of  the  predictor-corrector  scheme  (Algorithm  1.2) 

5  Numerical  Experiments 

We  have  implemented  the  spectral  deferred  correction  scheme  (Algorithm  1.1)  and 
the  predictor-corrector  method  (Algorithm  1.2)  for  different  sets  of  parameters  and 
tested  their  performance.  This  section  discusses  the  results  in  three  steps.  First,  it 
gives  a  detailed  description  of  the  implemented  schemes.  Second,  it  describes  the 
numerical  experiments  establishing  the  schemes’  stability  and  accuracy  properties. 
And,  finally,  the  performance  of  the  implemented  schemes  is  demonstrated  on  two 
examples. 

The  implementation  of  Algorithm  1.1  and  Algorithm  1.2  requires  several  param¬ 
eter  choices.  Both,  Algorithm  1.1  and  Algorithm  1.2,  require  a  discretization  of  the 
complex  semi-disk  Sr  (40).  The  discretization  depends  on  the  radius  of  Sr,  denoted 
by  r,  and  on  the  number  of  rows  and  columns  of  matrix  A  in  (46),  denoted  by  M  and 
N,  respectively.  Furthermore,  it  depends  on  the  accuracy  of  the  factorization  (22), 
i.e.  the  operator  norm  of  X  in  (22),  of  matrix  A  (46),  which  will  be  denoted  by  5. 
Finally,  the  number  of  points  in  the  resulting  discretization  will  be  denoted  by  n. 
The  parameters  for  the  computation  of  the  discretization  of  Sr  employed  in  the  im¬ 
plementations  discussed  in  this  section  are  listed  in  columns  3  —  7  of  Table  2  and 
Table  3. 

Remark  19  In  all  cases,  the  parameters  M  and  N  (the  number  of  columns  and  rows 
in  the  matrix  A  in  (46))  have  been  set  to  800,  and  the  calculations  were  performed 
in  extended  precision  (32  digits).  This  number  was  determined  empirically  for  the 
31-digit  case;  since  it  is  guaranteed  to  be  sufficient  for  all  lower  accuracies  and  results 
in  acceptable  precomputation  times,  no  further  analysis  has  been  performed. 

The  deferred  correction  scheme  (Algorithm  1.1)  computes  the  integration  matrix  r 
in  equation  (49)  by  a  minimum  norm  least  squares  fit  (21).  The  accuracy  of  this  fit, 
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i.e.  £  in  (21),  will  be  denoted  by  £o  and  the  number  of  time  steps  (the  number  of 
columns  of  r)  will  be  denoted  by  k.  The  computation  of  r  is  a  precomputation  and 
has  been  performed  in  extended  precision  (32-digits).  The  £  in  the  stopping  condition 
(Step  4)  of  Algorithm  1.1  is  denoted  by  £j.  In  our  implementation  we  perform  one 
more  iteration  once  this  accuracy  has  been  achieved.  This  section  discusses  four 
different  implementations  of  the  deferred  correction  scheme.  The  parameters  of  each 
implementation  are  listed  in  Table  2. 

The  predictor-corrector  scheme  precomputes  the  predictor  and  corrector  formu¬ 
lae  by  calculating  the  minimum  norm  least  squares  solution  (21)  of  equations  (33) 
and  (38).  The  number  of  time  steps  in  these  formulae  is  denoted  by  k.  The  accuracies 
of  minimum  norm  least  squares  solutions,  i.e.  £  in  (21),  in  the  predictor  and  corrector 
case  are  denoted  by  £p  and  £c,  respectively.  Again,  the  precomputations  have  been 
performed  in  extended  precision  (32  digits).  The  parameters  of  the  implementations 
of  the  predictor  corrector  scheme  discussed  in  this  section  are  listed  in  Table  3.  Fur¬ 
thermore,  as  an  example,  Table  4  lists  the  integration  and  extrapolation  vectors  p 
and  c  of  the  scheme  PCI,  which  are  obtained  during  Step  5  of  the  precomputations 
of  Algorithm  1.2. 

All  schemes  have  been  implemented  in  FORTRAN  using  the  Lahey-Fujitsu  FOR¬ 
TRAN  95  compiler  with  the  optimization  flag  set  to  -o2.  Furthermore,  the  timings 
in  Section  5.2  have  been  measured  on  an  Intel  Pentium  4,  3.2  GHz  with  2  GB  RAM. 

5.1  Stability  and  accuracy  properties 

This  section  discusses  the  stability  and  accuracy  properties  of  the  deferred  correction 
implementations  listed  in  Table  2  and  of  the  predictor-corrector  implementations 
listed  in  Table  3.  Specifically,  we  numerically  construct  the  stability  and  accuracy 
domains  of  the  implemented  schemes,  which  are  defined  as  follows  (see  also,  Sec¬ 
tion  2.2). 

Let  y  denote  the  numerical  solution  to  equations  (6),  (7)  obtained  at  an  equidis¬ 
tant  discretization  of  [0, 1]  with  step-size  h  via  one  of  the  deferred  correction  schemes 
listed  in  Table  2.  The  stability  domain  of  the  scheme  is  the  set  of  all  values  A h  for 
which  |y(l)|  <  1.  Furthermore,  for  a  given  £  the  accuracy  domain  is  defined  to  be 
the  set  of  all  Xh,  for  which 


El-XR)I2 


<  e, 


(56) 


where  y  :  C  t— >  C,  z  i— >  eXz  denotes  the  analytical  solution  to  (6),  (7). 

Similar,  let  <p  denote  the  numerical  solution  to  (6),  (7)  obtained  at  an  equidistant 
discretization  . . .  ,tn  of  [0,1]  of  step-size  h  via  one  of  the  predictor  corrector 

schemes  listed  in  Table  3  with  the  given  starting  values  . . . ,  < p(tk )•  The  stability 

domain  of  the  scheme  is  the  set  of  all  values  Xh  for  which  |y(fj)|  <  1  holds  for  all 
i  =  1  The  latter  condition  is  satisfied  if  the  largest  singular  value  of  the 

matrix  B  defined  in  equation  (10)  is  less  than  one. 
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c 


_ p _ 

5.6950867745996450  x  Hr'-' 

1.2471828303461370 x  10“2 
2.8894215936093530  x  10“2 
2.4083554081417720  x  10“2 
1.7345252832690130 x  10“2 
2.8519730870305910  x  10“2 
3.6071920983279110  x  10“2 
2.8966177271448650  x  10"2 
2.2315178842523700  x  10“2 
2.9905186291956070  x  10“2 
4.4440048810368700  x  10“2 
4.9069767898691480  x  10“2 
4.0650298910123330  x  10“2 
3.4669719074163200  x  10“2 
4.5611440916694790 x  10“2 
6.6034586913433500  x  10“2 
7.2628793760997530  x  10“2 
5.9216748674729610  x  10“2 
5.6406142035183890  x  10“2 
8.9720539463978590  x  10“2 
1.0201521115897360 x  10"1 
5.4012787919006680  x  10“2 
6.0560922579890260  x  10“3 
-2.7807762994373060  x  10“2 
1.1956698573336000 x  10"1 
-1.4368259417051200  x  10"1 
4.0276974389784590  x  10“2 
1.4372084163410750 x  10"1 
-6.8513418990535620  x  10“3 
-1.0944090055846840  x  10"1 
-2.0192137915885270  x  10“3 
1.4595770836923140 x  10"1 
1.2479716863929200 x  10"1 
-3.4116614768704690  x  10“2 
-1.1144883327854800  x  10"1 
1.5733708304525340 x  10“2 
2.0002981731434640  x  10"1 
1.8114967731355230 x  10"1 
-5.8479574699884930  x  10"2 
-1.6214192347450070  x  10"1 
1.6621388638955530 x  10"1 
4.2830160005071360  x  10"1 
-3.5531444249916690  x  10"1 
3.1221966556634160 x  10"1 


2.2410253811826690  x  10”2 
2.4633154278189170  x  10"2 
2.4336066393818950  x  10”2 
2.5149950648957660  x  10”2 
2.6486814374770810  x  10”2 
2.7234737421409750  x  10”2 
2.8198076103084640  x  10”2 
3.0096942089726130  x  10"2 
3.2486068591206280  x  10"2 
3.4641908350425040  x  10~2 
3.6616005690985700  x  10”2 
3.9130905383232570  x  10”2 
4.2655375218158010  x  10"2 
4.6832060662658940  x  10"2 
5.0917965571713190 x  10”2 
5.4758986719566050  x  10”2 
5.9165803238876420  x  10”2 
6.5051564452405580  x  10”2 
7.2079207171644550  x  10~2 
7.8700198716440890  x  10”2 
8.4738392656218300  x  10“2 
9.3679562525744830  x  10"2 
3.7583455331903450  x  10”4 
5.8682877788308760  x  10”3 
-7.0838293847688030  x  10~4 
1.5022356394256900  x  10”2 
1.0788479881224430  x  10”2 
6.8440197117482410  x  10”3 
1.4847045428708830  x  10”2 
2.4033339162535270  x  10"2 
2.4650259312919750  x  10”2 
2.0743990881309220  x  10~2 
2.2174318623909730  x  10”2 
3.1578482298307720  x  10”2 
4.1718536023450730  x  10”2 
4.4487205746717590  x  10"2 
4.0960879667910390  x  10”2 
4.1289524931745020  x  10”2 
5.3143311356800410  x  10”2 
6.9904178604996720  x  10”2 
7.4460084585029470  x  10~2 
6.3619818896987330  x  10”2 
7.0261382310043240  x  10"2 
1.2124592734310950  x  10"1 
3.1145960381730385  x  10”2 


Table  4:  The  vectors  p  and  c  for  the  scheme  PCI 
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Figure  2:  Stepping  scheme  for  the  stability  and  accuracy  domain  search 


For  a  given  e,  the  accuracy  domain  of  the  predictor  corrector  schemes  listed  in 
Table  3  is  defined  to  be  the  set  of  all  A h,  for  which 


EIU+X^I2 


<  e, 


(57) 


where  <p  :  C  i— >  C,  z  i— >  eXz  denotes  the  analytical  solution  to  equations  (6),  (7). 

Since  the  numerical  solution  tp  to  (6),  (7)  is  an  analytical  function  of  A  for  both, 
the  deferred  correction  and  the  predictor  corrector  schemes,  and  furthermore,  the 
analytical  solution  ip  to  (6),  (7)  is  an  analytical  function  of  A  as  well,  it  follows  from 
the  maximum  principle  (Lemma  8)  that  the  stability  and  accuracy  domains  have  well- 
defined  boundaries.  In  the  case  of  the  stability  domain  the  boundary  is  the  set  of 
points  for  which  |</?(1)|  equals  one.  In  the  case  of  the  accuracy  domain  the  boundary 
is  the  set  of  points  for  which  the  lefthand  sides  of  (56),  (57)  equals  e. 

The  numerical  construction  of  the  boundaries  is  illustrated  in  Figure  2.  First, 
the  secant  method  (30)  is  used  to  search  for  the  boundary  point  Z\  G  C  along  the 
imaginary  axis  in  the  upper  half  plane.  If  the  secant  method  fails  to  converge,  z\  is 
computed  by  bisection  (29).  Second,  a  small  step  is  taken  from  z\  into  the  negative 
half  plane  along  a  line  orthogonal  to  the  imaginary  axis  ending  at  the  point  z'2. 
The  secant  method  is  used  to  search  for  the  next  boundary  point  z 2  within  a  small 
interval  around  z'2  along  the  line  which  passes  through  z2  and  is  orthogonal  to  the  line 
through  zi  and  z2  .  As  before,  if  the  secant  method  fails,  Z2  is  computed  by  bisection. 
Next,  a  small  step  is  taken  starting  from  Z2  along  the  vector  from  z\  to  Z2  and  ending 
at  z'3.  The  whole  procedure  is  repeated  until  a  step  crosses  the  imaginary  axis  in  the 
upper  half  plane. 

All  stability  domains  were  computed  in  double  precision  (16  digits).  The  accuracy 
domains  were  computed  in  double  precision  (16  digits),  if  e  >  10”12,  and  in  extended 
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precision  (32  digits)  for  the  cases  of  smaller  e. 

Figure  3  shows  the  stability  domains  and  the  accuracy  domains  for  various  £ 
for  the  deferred  correction  schemes  listed  in  Table  2,  while  Figures  4  and  5  show 
the  domains  of  the  predictor-corrector  schemes  listed  in  Table  3.  In  each  figure  the 
accuracy  domains  are  marked  by  the  label  “e  =  x” ,  where  x  stands  for  the  specific 
choice  of  e  for  the  domain  in  question.  The  stability  domains  are  indicated  by  the 
label  “5"’.  For  each  predictor-corrector  scheme  the  stability  domains  are  shown  for 
the  cases  of  one,  two  and  three  correction  steps,  labeled  by  “m  =  1”,  “m  =  2”  and 
“m  =  3” ,  respectively. 

The  presented  figures  and  our  more  detailed  numerical  experiments  motivate  the 
following  observations. 

1.  The  specific  choice  of  the  discretization  parameters  has  only  a  minor  effect  on  the 
properties  of  the  resulting  deferred  correction  or  predictor- corrector  scheme,  as 
long  as  M  and  N  are  sufficiently  large  (in  our  experiments  800)  and  6  is  smaller 
than  the  desired  accuracy  of  the  scheme. 

2.  The  stability  domains  of  the  deferred  correction  schemes  are  quite  insensitive 
to  changes  in  the  parameters  k,  £d  and  £j  and,  hence,  the  stability  domains  of 
the  schemes  DC1-DC4  are  quite  similar. 

3.  The  accuracy  domain  for  a  deferred  correction  scheme  tends  to  be  close  to 
(but  slightly  smaller  than)  the  one  prescribed.  For  example,  in  the  case  of 
DC1  with  r  =  3.15,  k  =  22  the  scheme  is  constructed  to  obtain  the  precision 
£r>  =  10-9  within  a  semi-circle  in  the  Ah-plane  with  radius  0.287  (since  during 
the  construction  h  —  2/k  ~  0.091).  Figure  3(a)  shows,  that  as,  a  matter  of  fact, 
DC1  obtains  the  precision  ICC8  within  an  approximate  semi-circle  of  radius  0.28, 
which  corresponds  to  22  steps  per  wavelength. 

Consequently,  it  is  straightforward  to  construct  spectral  deferred  correction 
schemes,  which  obtain  a  desired  precision.  The  schemes  we  implemented  pro¬ 
duce  8  digits  at  22  steps  per  wavelength  (DC1),  16  digits  at  30  steps  per  wave¬ 
length  (DC2),  15  digits  at  30  steps  per  wavelength  (DC3)  and  28  digits  at  42 
steps  per  wavelength  (DC4). 

4.  As  expected,  in  the  case  of  the  predictor- corrector  schemes  the  corrector  step  is 
significantly  more  stable  than  the  predictor  step.  Hence,  the  stability  domains  of 
the  predictor- corrector  schemes  increase  with  the  number  of  correction  steps  m. 
Overall,  the  stability  domains  of  the  predictor-corrector  schemes  are  very  sen¬ 
sitive  to  the  choice  of  k,  £p  and  £c ■  Increasing  k,  while  fixing  £p  and  £c  will 
increase  the  stability  domain.  Consequently,  a  straightforward  procedure  for 
choosing  k  is  to  fix  £P  and  £q  at  the  desired  precision  and  to  increase  k  un¬ 
til  the  stability  domain  with  m  —  1  encompasses  the  accuracy  domain  of  the 
desired  e. 

5.  Similar  to  the  spectral  deferred  correction  schemes,  the  accuracy  domains  of 
the  predictor-corrector  schemes  is  close  to  the  one  prescribed,  making  it  rather 
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straightforward  to  construct  schemes  with  a  specific  precision.  The  schemes  we 
implemented  produce  7  digits  at  22  steps  per  wavelength  (PCI),  15  digits  at  38 
steps  per  wavelength  (PC2),  15  digits  at  36  steps  per  wavelength  (PC3)  and  28 
digits  at  66  steps  per  wavelength. 

6.  Outside  the  region  where  a  deferred  correction  scheme  or  a  predictor-corrector 
scheme  achieves  its  maximal  accuracy,  the  obtained  precision  decays  slowly.  For 
example,  at  21  steps  per  wavelength  the  scheme  DC4  still  obtains  15  digits. 

5.2  Numerical  examples 

In  this  section  we  illustrate  the  performance  of  the  presented  schemes  on  two  standard 
examples  for  non-stiff  problems:  the  Bessel  differential  equation  and  the  Jacobi  elliptic 
functions. 

For  both  examples  of  this  section  we  have  used  each  of  the  four  predictor-corrector 
schemes  listed  in  Table  3  with  one  corrector  step  per  time  step  to  obtain  a  numerical 
solution  at  various  equidistant  discretizations.  Each  of  these  schemes  requires  k  (see 
Column  7  of  Table  3)  starting  values,  which  are  computed  via  the  corresponding 
deferred  correction  schemes  listed  in  Table  2.  Computations  have  been  performed  in 
double  (16-digit)  precision  (schemes  DC1+PC1,  DC2+PC2  and  DC3+PC3),  as  well 
as  in  extended  (32-digit)  precision  (schemes  DC2+PC2  and  DC4+PC4).  For  each 
example  we  have  computed  several  measures  of  performance  (see  Tables  5-12)  and 
compared  them  to  other  numerical  ODE  solvers  (see  Figures  8-10  and  13-15). 

The  benchmark  methods  we  use  as  a  comparison  are  highly  optimized  Mathe- 
matica  implementations  of  a  Runge-Kutta  scheme  and  an  Adams-Bashforth-Moulton 
predictor-corrector  method.  The  Runge-Kutta  scheme  corresponds  to  the  Mathemat- 
ica  function  “NDSolve”  with  the  option  “Method— >ExplicitRungeKutta” .  It  adap¬ 
tively  changes  the  order  of  the  Runge-Kutta  method  between  1  and  8.  The  Adams- 
Bashforth-Moulton  scheme  corresponds  to  the  function  “NDSolve”  with  the  option 
“Method— >•  Adams” .  It  changes  the  order  of  accuracy  adaptively  between  1  and  12. 
Both  methods  adaptively  control  the  step-size  to  achieve  a  specified  accuracy,  i.e.  the 
discretization  nodes  at  which  the  solution  is  returned  depend  on  the  specific  problem. 

Remark  20  We  have  chosen  the  Mathematica  routines  because  their  performance 
appears  to  be  representative  of  state-of-the-art  direct  ODE  solvers.  It  should  be  noted, 
however,  that  the  additional  layers  of  order  and  step-size  control  of  the  Mathematica 
routines,  as  well  as  the  different  compiler  environment,  make  a  direct  comparison  of 
the  underlying  algorithms  problematic.  The  performance  comparison  should  therefore 
be  seen  more  as  an  indication  than  as  an  absolute  measure  of  the  relative  performance. 

5.2.1  Bessel  differential  equation 

The  Bessel  function  Jn  of  order  n  is  analytic  in  the  whole  plane  and  satisfies  the 
differential  equation  (see,  for  example,  [1]) 

x2  J'n{x)  +  x  J'n(x)  +  ( x 2  -  n2)Jn(x )  =  0.  (58) 
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Figure  6:  Plot  of  the  Bessel  function  J50 


The  asymptotic  behavior  of  Jn  is  given  by  the  formula 


JrX'J') 


+  0 


(59) 


indicating  that  for  sufficiently  large  x  any  Bessel  function  oscillates  with  a  frequency 
of  approximately  1. 

Reducing  equation  (58)  to  the  standard  form  (1)  by  introducing  the  auxiliary 
variable  a  =  J'n  yields  a  system  of  two  coupled  equations.  We  solve  this  system  for 
the  Bessel  function  J50  in  the  interval  [50, 15000],  where  the  initial  values  </5o(50)  and 
J50(50)  are  given.  A  graph  of  J50  on  the  interval  [0, 100]  is  depicted  in  Figure  6. 

Tables  5,  6,  7  and  8  summarize  the  results  for  the  schemes  DC1+PC1,  DC2+PC2, 
DC3+PC3  and  DC4+PC4,  respectively.  Each  table  lists  the  following  measures  of 
performance.  Column  1  shows  the  number  of  nodes  ns  in  the  equidistant  discretization 
of  [50,15000],  while  Column  2  shows  the  corresponding  step-size  h.  Column  3  lists 
the  number  of  evaluations  of  the  right  hand  side  of  (58)  required  by  the  deferred 
correction  scheme  to  compute  the  k  starting  values.  Furthermore,  Column  4  lists 
the  number  of  evaluations  required  by  the  predictor-corrector  scheme  to  compute  the 
ns  —  k  remaining  values.  Column  5  shows  the  total  CPU  time  in  seconds,  where 
the  timings  have  been  measured  on  an  Intel  Pentium  4,  3.2  GHz  with  2  GB  RAM, 
as  mentioned  above.  The  computations  via  the  schemes  DC1+PC1,  DC2+PC2  and 
DC3+PC3  have  been  performed  in  double  (16-digit)  precision,  while  the  computations 
via  the  scheme  DC4+PC4  have  been  performed  in  extended  (32-digit)  precision. 

Finally,  Column  6  lists  the  /2-error  of  the  resulting  solution,  which  was  computed 
as  follows.  If  (p  denotes  the  analytical  solution  of  the  Bessel  differential  equation,  and 
tp  denotes  the  numerical  solution,  obtained  at  the  discretization  ti, ...  ,tna,  then  we 
compute  the  Z2-error  of  the  numerical  solution  in  question  at  the  last  200  nodes  as 


'£fclns-200  N4)-^)|2 
"  £^-20c>(4)|2  ' 


(60) 


26 


ns 

h 

Ndc 

Npc 

t/  sec 

ep 

38,000 

0.3934 

345 

75, 956 

0.027 

1.18  x  10"U2 

42,000 

0.3560 

345 

83,956 

0.029 

1.67  x  10-°3 

46,000 

0.3250 

280 

91,956 

0.031 

1.92  x  10“04 

50,000 

0.2990 

280 

99,956 

0.034 

2.19  x  lO"06 

Table  5:  Performance  of  the  schemes  DC1+PC1 

on  the  Bessel  differential  equation 

ns 

h 

rinc 

nPC 

t/  sec 

52,000 

0.2875 

1,490 

103,880 

0.078 

4.99  x  10"06 

56,000 

0.2670 

1,311 

111,880 

0.085 

5.39  x  10"08 

60,000 

0.2492 

1,311 

119,880 

0.089 

4.27  x  10"09 

64, 000 

0.2336 

1,311 

127,880 

0.096 

2.91  x  10"10 

68,000 

0.2199 

1,132 

135,880 

0.103 

5.40  x  10"11 

Table  6:  Performance  of  the  schemes  DC2+PC2  on  the  Bessel  differential  equation 


ns 

h 

nDc 

nPC 

t/  sec 

82,500 

0.1812 

665 

164,916 

0.092 

3.26 

X 

10' 

-06 

82,550 

0.1811 

665 

165,016 

0.092 

5.68 

X 

io- 

-08 

82,600 

0.1810 

790 

165,116 

0.092 

2.33 

X 

io- 

-10 

82,650 

0.1809 

665 

165,216 

0.092 

1.90 

X 

io- 

-10 

82, 700 

0.1808 

665 

165,316 

0.092 

9.27 

X 

io- 

-11 

Table  7:  Performance  of  the  schemes  DC3+PC3  on  the  Bessel  differential  equation 


n.s 

h 

nDc 

npc 

t/  sec 

115,000 

0.1300 

2,707 

229, 840 

45.9 

2.89 

X 

io- 

-17 

120,000 

0.1246 

2,468 

239, 840 

47.9 

6.17 

X 

io- 

-18 

130,000 

0.1150 

2,468 

259, 840 

51.9 

3.27 

X 

io- 

-19 

140, 000 

0.1068 

2,468 

279,  840 

55.9 

1.95 

X 

io- 

-20 

150,000 

0.0996 

2,468 

299, 840 

59.8 

1.22 

X 

io- 

-21 

160,000 

0.0934 

2,229 

319,840 

63.9 

7.48 

X 

io- 

-23 

170,000 

0.0879 

2,229 

339, 840 

67.7 

3.90 

X 

io- 

-24 

180,000 

0.0830 

2,229 

359, 840 

71.7 

1.42 

X 

io- 

-25 

190,000 

0.0786 

2,229 

379, 840 

75.5 

3.63 

X 

io- 

-27 

195,000 

0.0766 

2,229 

389, 840 

77.5 

3.83 

X 

io- 

-27 

Table  8:  Performance  of  the  schemes  DC4+PC4  on  the  Bessel  differential  equation 
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(a)  RHS  evaluations 


(b)  time  in  sec. 


Figure  7:  Performance  comparison  for  the  Bessel  equation 


Figure  8:  Required  RHS  evaluations  for  the  Bessel  equation 


t/ sec 


Figure  9:  Number  of  steps  for  the  Bessel  equation 


Figure  10:  Computation  times  for  the  Bessel  equation 
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In  our  numerical  calculations  the  values  of  ip  =  J 50  are  computed  via  a  standard 
numerical  procedure  for  the  evaluation  of  the  Bessel  function  (see,  for  example,  [1]). 

Figures  7(a)  and  7(b)  illustrate  the  difference  in  performance  of  the  schemes 
DC1+PC1,  DC2+PC2  and  DC3+PC3  for  solving  the  Bessel  differential  equation  (58) 
of  order  50  with  double  (16-digit)  precision.  Figure  7(a)  shows  the  total  number  of 
evaluations  Nevai  =  ripe  +  npc  of  the  right  hand  side  of  (1)  versus  the  /2-error  sp 
as  defined  in  equation  (60),  i.e.  the  obtained  accuracy.  Figure  7(b)  shows  the  total 
CPU  time  taken  by  the  calculation  versus  the  Z2-error  ep. 

Finally,  Figures  8-10  show  a  comparison  of  the  performance  of  the  schemes  PC3 
and  PC4  to  the  performance  of  the  Mathematica  implementations  of  a  Runge-Kutta 
scheme  and  an  Adams-Bashforth-Moulton  scheme,  as  described  above.  Figure  8  shows 
the  total  number  of  evaluations  Nevai  =  npc  +  npc  of  the  right  hand  side  of  (1)  versus 
the  achieved  accuracy  (the  Z2-error  ep).  Figure  9  shows  the  number  of  steps  ns  in  the 
integration  interval  versus  the  achieved  accuracy,  and  Figure  10  shows  the  total  CPU 
time  t  versus  the  achieved  accuracy.  All  axes  in  Figures  8-10  use  a  logarithmic  scale 
and  all  the  computations  for  these  figures  have  been  performed  in  extended  (32-digit) 
precision. 

Remark  21  The  kinks  in  the  graphs  for  the  Runge-Kutta  and  Adams  schemes  are 
artifacts  due  to  the  adaptive  order  and  step-size  control. 

Several  observations  derived  from  the  above  results  are  summarized  at  the  end  of 
this  section. 


5.2.2  Jacobi  elliptic  functions 

The  Jacobi  elliptic  functions  sn,  cn,  dm  are  another  commonly  used  example  for  a 
non-stiff  problems  (see,  for  example,  [1,  10,  6]).  The  Jacobi  elliptic  functions  with 
parameter  m  €  R  satisfy  the  equations 

sn'(t)  =  cn(t )  ■  dn(t )  (61) 

cn{t)  =  — sn(t )  •  dn(t)  (62) 

dm! [t)  =  -m-sn{t)-cn{t).  (63) 

We  apply  our  schemes  to  solving  this  system  on  the  interval  [0,  2000]  for  the  case 
m  =  0.5  with  the  initial  values  sn(0)  =  0,  cn( 0)  =  1,  dn{ 0)  =  1.  The  analytical 
solution  for  this  case  can  be  expressed  as  the  series’  (see,  for  example,  [1]) 


2t t  “ 1  „n+ 1/2 

Sn{t)  =  y/lj2K  ^  1  ~  g2n+1  Sln((2W  + 

2t t  00  nn+ 1/2 

cn(t)  =  cos((2n + 1)>'). 
O  OO  Y) 

7  ,  .  7 r  Z7T  Q 

dn[t)  =  2K  +  1<^  TTlp  cos(2n!')’ 

71=1  1 


(64) 

(65) 

(66) 
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Figure  11:  Plot  of  the  Jacobi  elliptic  functions  sn,  cn,  dn  with  m  =  | 


where  q  =  e  n,  v  —  ^  and 

r/2  de  ,  , 

Jl  =  /  =  «  1.85.  (67) 

Jo  sin2  6 

The  first  25  terms  of  the  series’  (64),  (65),  (66)  are  sufficient  to  yield  the  result  with 
32-digit  accuracy.  Figure  11  shows  a  plot  of  the  Jacobi  elliptic  functions  with  m  =  0.5 
in  the  interval  [0, 12];  they  are  periodic  with  period  4  K. 

Analogous  to  the  previous  example,  we  have  used  the  schemes  DC1+PC1,  DC2+ 
PC2,  DC3+PC3  and  DC4+PC4  listed  in  Tables  2  and  3  with  one  corrector  step 
per  time  step  to  obtain  a  solution  to  system  (61),  (62),  (63)  at  various  equidistant 
discretization  of  the  interval  [0,  2000].  The  computations  for  the  schemes  DC1+PC1, 
DC2+PC2,  DC3+PC3  have  been  performed  in  double  (16-digit)  precision,  while  the 
computations  for  the  scheme  DC4+PC4  have  been  performed  in  extended  (32-digit) 
precision.  The  results  of  these  tests  are  summarized  in  Tables  10-12. 

As  for  the  previous  example,  Column  1  of  each  table  lists  the  number  of  nodes  ns 
in  the  equidistant  discretization  of  the  integration  interval,  while  Column  2  shows  the 
resulting  step-size  /i;  Column  3  lists  the  number  of  evaluations  of  the  right  hand  side 
of  the  system  (61),  (62),  (63)  required  by  the  deferred  correction  scheme  to  compute 
the  k  required  starting  values,  and  Column  4  lists  the  number  of  evaluations  required 
by  the  predictor-corrector  scheme  to  compute  the  ns  —  k  remaining  values. 

Column  5  lists  the  average  /2-error,  which  was  computed  as  follows.  If  tp  denotes 
the  analytical  solution  of  any  of  the  functions  sn,  cn  or  dn,  obtained  via  series  (64), 
(65)  or  (66),  and  p  denotes  the  corresponding  numerical  solution,  obtained  at  the 
discretization  ti, ...  ,tHs,  then  the  /2-error  of  this  numerical  solution  is  given  by  equa¬ 
tion  (60).  The  values  of  p  are  computed  in  extended  precision  (32-digits)  using  the  25 
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first  terms  of  the  series’  (64),  (65),  (66).  In  all  of  our  computations  the  errors  £p  (sn), 
£p(cn),  £p(dn )  were  of  approximately  the  same  size.  In  Tables  10-12,  Column  5 
reports  the  average 

1 

£i2  =  g  (U2  {sn)  +  ep  (era)  +  £p  (dn) ) .  (68) 

Figure  12  illustrates  graphically  the  difference  in  the  number  of  required  func¬ 
tion  evaluations  of  the  schemes  DC1+PC1,  DC2+PC2,  DC3+PC3,  DC4+PC4  for 
various  accuracies  on  a  logarithmic  scale.  As  for  the  tables,  the  results  in  this  fig¬ 
ure  for  the  schemes  DC1+PC1,  DC2+PC2  and  DC3+PC3  have  been  computed  in 
double  (16-digit)  precision,  while  the  results  for  the  scheme  DC4+PC4  have  been 
computed  in  extended  (32-digit)  precision.  Finally,  Figures  13-15  show  a  peformance 
comparison  between  the  scheme  DC4+PC4  and  the  Mathcmatica  implementations  of 
a  Runge-Kutta  scheme  and  an  Adams-Bashforth-Moulton  scheme.  Figure  13  shows 
the  number  of  evaluations  of  the  right  hand  side  of  system  (61),  (62),  (63)  versus  the 
achieved  accuracy  (the  average  /2-error  ip ) .  Figure  14  shows  the  number  of  taken 
steps  ns  versus  the  achieved  accuracy  and  Figure  15  shows  the  total  CPU  time  versus 
the  achieved  accuracy.  All  computations  for  Figures  13-15  have  been  performed  in 
extended  (32-digit)  precsion  and  the  axes  of  Figures  13-15  use  logarithmic  scales. 

5.2.3  Observations 

The  numerical  experiments  reported  in  the  preceding  two  subsections  (as  well  as  other 
numerical  experiments  we  have  performed)  motivate  the  following  observations. 

1.  For  any  of  the  predictor-corrector  schemes  (with  a  fixed  number  of  correction 
steps)  there  exists  a  problem-dependend  range  of  step-sizes  for  which  the  scheme 
behaves  similiar  to  a  classically  convergent  scheme:  any  decrease  in  step-size  h 
will  increase  the  accuracy.  However,  once  the  step-size  h  is  sufficiently  small, 
further  decreasing  the  step-size  will  not  result  in  further  accuracy  improvements. 
On  the  other  hand,  if  the  step-size  becomes  too  large,  the  scheme  will  become 
unstable.  For  example,  in  the  Bessel  case  the  scheme  PC2  becomes  stable  at  the 
step-sizes  h  <  0.29  and  decreasing  the  step-size  from  h  =  0.29  to  h  =  0.22  will 
will  improve  the  accuracy  from  6  digits  to  11  digits.  The  scheme  PC3  becomes 
stable  at  h  =  0.18,  where  it  achieves  10  digits  and  further  decreases  in  the 
step-size  h  will  not  improve  that  accuracy.  If  the  scheme  PC3  is  run  with  two 
correction  steps,  its  stability  region  will  increase  and  it  will  achieve  reasonable 
accuracies  for  step-sizes  bigger  than  h  =  0.18. 

2.  The  performance  of  the  predictor-corrector  schemes  in  the  examples  above  could 
have  been  predicted  from  the  accuracy  and  stability  domains  shown  in  Figures  4 
and  5.  For  example,  in  the  Bessel  case  where  the  maximal  frequency  A  ~  1 
Figure  4(c)  shows  that  the  scheme  PC2  with  1  correction  becomes  stable  for 
steplcngths  h  <  0.3.  Furthermore,  Figure  4(d)  shows  that  at  h  ~  0.3  the 
scheme  will  achieve  8  digits  and  at  h  ~  0.22  it  will  achieve  15  digits.  The  fact 
that  Table  6  shows  slightly  lower  accuracies  is  due  to  round-off  errors,  which 
accumulated  over  the  long  integration  interval. 
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ns 

h 

nDc 

ripe 

t/  sec 

ep 

8,000 

0.2500 

670 

15,956 

0.0069 

1.59  x  10”U1 

10,000 

0.2000 

540 

19,956 

0.0080 

2.51  x  10~02 

12,000 

0.1667 

410 

23,956 

0.0094 

1.60  x  10~02 

14,000 

0.1429 

345 

27,956 

0.0108 

1.88  x  10"03 

16,000 

0.1250 

345 

31,956 

0.0126 

3.67  x  10-°4 

18,000 

0.1111 

280 

35,956 

0.0140 

2.31  x  lO"04 

Table  9:  Performance  of  the  schemes  DC1+PC1  on  the  Jacobi  differential  equations 


ns 

h 

nDc 

nPC 

t/  sec 

10,000 

0.2000 

3,638 

19,880 

0.019 

3.90 

X 

10- 

-01 

14,000 

0.1429 

2,564 

27,880 

0.025 

3.48 

X 

io- 

-02 

18,000 

0.1111 

2,206 

35,880 

0.033 

4.96 

X 

io- 

-03 

22,000 

0.0909 

1,848 

43,880 

0.039 

6.51 

X 

io- 

-04 

26,000 

0.0769 

1,490 

51,880 

0.047 

3.32 

X 

io- 

-05 

30,000 

0.0667 

1,132 

59,880 

0.053 

7.94 

X 

io- 

-07 

34, 000 

0.0588 

953 

67,880 

0.060 

2.17 

X 

io- 

-08 

38,000 

0.0526 

953 

75, 880 

0.068 

5.96 

X 

io- 

-10 

42,000 

0.0476 

953 

83,880 

0.075 

4.86 

X 

io- 

-12 

46,000 

0.0435 

953 

91,880 

0.082 

2.13 

X 

10- 

-12 

Table  10:  Performance  of  the  schemes  DC2+PC2  on  the  Jacobi  differential  equations 


n.s 

h 

nDc 

nPC 

t/  sec 

16,000 

0.1250 

1,415 

31,916 

0.021 

5.04 

X 

io- 

-03 

20,000 

0.1000 

1,040 

39,916 

0.026 

2.39 

X 

io- 

-06 

24, 000 

0.0833 

790 

47,916 

0.032 

7.49 

X 

io- 

-08 

28,000 

0.0714 

665 

55,916 

0.036 

6.31 

X 

io- 

-09 

32,000 

0.0625 

665 

63,916 

0.042 

1.49 

X 

io- 

-09 

36,000 

0.0556 

665 

71,916 

0.047 

3.11 

X 

io- 

-11 

40,000 

0.0500 

665 

79,916 

0.052 

1.73 

X 

io- 

-11 

Table  11:  Performance  of  the  schemes  DC3+PC3  on  the  Jacobi  differential  equations 
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ns 

h 

20,000 

0.1000 

40,000 

0.0500 

60,000 

0.0333 

80,000 

0.0250 

100,000 

0.0200 

120,000 

0.0167 

140,000 

0.0143 

160,000 

0.0125 

180,000 

0.0111 

riDC 

ripe 

6292 

39, 840 

3424 

79, 840 

2229 

119,840 

2229 

159,840 

1990 

199,840 

1990 

239, 840 

1990 

279,  840 

1751 

319,840 

1751 

359, 840 

t/  sec  sp 


12.1 

1.44 

X 

10- 

-02 

23.7 

7.02 

X 

io- 

-11 

35.4 

1.07 

X 

io- 

-15 

47.1 

5.37 

X 

io- 

-19 

58.8 

6.76 

X 

io- 

-22 

70.5 

3.23 

X 

io- 

-24 

82.3 

9.05 

X 

io- 

-26 

94.2 

5.15 

X 

io- 

-27 

106 

5.73 

X 

io- 

-27 

Table  12:  Performance  of  the  schemes  DC4+PC4  on  the  Jacobi  differential  equations 


Figure  12:  Required  RHS  evaluations  for  the  Jacobi  elliptic  functions 
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Figure  13:  Required  RHS  evaluations  for  the  Jacobi  elliptic  functions 
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Figure  14:  Number  of  steps  for  the  Jacobi  elliptic  functions 
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Figure  15:  Computation  times  for  the  Jacobi  elliptic  functions 


3.  The  convergence  rate  of  the  predictor-corrector  schemes  of  this  paper  is  signihn- 
cantly  higher  than  the  convergence  rate  of  a  high-order  Runge-Kutta  method 
or  a  high-order  Adams  predictor  corrector  method  (See  Figures  8-10  and  13-15, 
please  note  that  the  scales  are  logarithmic).  In  terms  of  the  number  of  steps  the 
Runge-Kutta  scheme  does  best  for  low  accuracies;  for  higher  accuracies  (12  dig¬ 
its  or  more)  the  schemes  of  this  paper  become  superior.  In  terms  of  the  number 
of  RHS  evaluations  and  the  actual  timings  our  schemes  are  superior  as  soon 
as  5  digits  or  more  are  required. 

6  Conclusions  and  future  work 

We  have  presented  a  new  class  of  predictor-corrector  schemes,  along  with  spectral 
deferred  correction  schemes  for  their  initialization.  Our  experiments  indicate  that  on 
a  wide  range  of  problems  the  schemes  of  this  paper  outperform  serveral  state-of-the- 
art  ODE  solvers,  particularly  when  high  accuracy  is  required. 

This  paper  discusses  explicit  schemes,  which  are  applicable  to  non-stiff  problems. 
Extension  of  this  work  to  stiff  environments  is  in  progress,  and  will  be  reported  at  a 
later  date. 
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